Week 06: Sentiment Analysis with AI

Using API to AI to create data from text

Published

May 20, 2025

Week 06: Sentiment Analysis with AI

Using API to AI to create data from text

Overview

Continue using text for research with AI

Learning Outcomes

By the end of the session, students will:

  • Gain hands-on experience with sentiment analysis.
  • Have experience integrating NLP in research
  • Think about what is ground truth

Assignment review

Review of transformation options

  • lexicon based counting numbers –> generate a transparent script
  • Machine-learning classifiers, n-grams etc.
  • LLM (transformer-based) one-shot: treating the LLM like a giant classifier: you hand it raw text and ask classes of sentiment. (Deep contextual understanding—word embeddings, attention across the whole sentence etc decide the sentiment.) No separate sentiment lexicon; it’s all encoded in the model weights.
  • You can also take a pretrained LLM and continue training it on thousands of labeled review. See our example guidelines HERE

Let us discuss the concept ground truth (again): AI vs humans vs “truth”

Using APIs

Basics and setup

How to use APIs

Get an API Key (ChatGPT and Claude)

Examples

Simple walkthrough with GDP data – uses World Bank and FRED APIs

Bit harder walkthrough with football data – uses FBREF soccer data. Guess the club for example.

More advanced stuff

More advanced knowledge on APIshow APIs work

Materials

Datasets

  • texts (text_id level)
  • games info (such as results, text_id level)
  • class-ratings (human, AI ratio, text_id*student level)
  • domain-rating (text_id level)
  • class-rating-aggregated (text_id level)

code

Interview Analysis

Code for sentiment analysis of football manager interviews, here: interview scripts

Preparation

  • Download the combined data from Moodle
    • Note: win, draw – need encode loss

Class tasks

Discussion 1

  • Your experience regarding human vs ai ratings.
  • What was difficult and easy as human rater

Data Analysis

  • Take the aggregated file and ask AI for a readme. Discuss what is in the data
  • Compare human, domain lexicon and AI rating. For human and AI take the average.
  • Think of an interesting comparison using AI rating
  • Compare results by human and lexicon rating

Discussion 2

  • What is ground truth

How to integrate AI into research

  • combine data with text
  • think RQ and how you’d use AI

Additional tasks if time permits

predict gender and result

  • Show AI all texts and ask to predict the gender of speaker
  • Show AI all texts and ask to predict the result (manager’s team won, drew, lost)